Results 1 - 14 of 14
1.
Res Sq ; 2023 Sep 08.
Article in English | MEDLINE | ID: mdl-37720037

ABSTRACT

Initially, research disciplines operated independently, but the emergence of transdisciplinary sciences led to convergence research, impacting graduate programs and research laboratories, especially in bioengineering and material engineering as presented here. Current graduate curricula fail to prepare students efficiently for multidisciplinary and convergence research, creating a gap between students' preparation and research laboratory expectations. We present a convergence training framework for graduate students, incorporating problem-based learning under the guidance of senior scientists and collaboration with postdoctoral researchers. This case study serves as a template for transdisciplinary convergent training projects, bridging the expertise gap and fostering successful convergence learning experiences in computational biointerface (material-biology interface). The 18-month Advanced Data Science Workshop, initiated in 2019, involves project-based learning, online training modules, and data collection. A pilot solution used Jupyter notebooks on Google Colaboratory and culminated in a face-to-face workshop where projects were presented and finalized. The program started with nine experts in four diverse fields creating 14 curated projects in data science (Artificial Intelligence/Machine Learning), material science, biofilm engineering, and biointerface. These were integrated into convergence research through webinars by the experts. The experts chose 8 of the 14 projects to be part of an all-day in-person workshop, where over 20 learners formed eight teams that tackled complex problems at the interface of digital image processing, gene expression analysis, and material prediction. Each team comprised students and postdoctoral researchers or research scientists from diverse domains including computer science, materials science, and biofilm research.
Some projects were selected for presentation at the international IEEE Bioinformatics conference in 2022, with three resulting Machine Learning (ML) models submitted as a journal paper. Students engaged in problem discussions, collaborated with experts from different disciplines, and received guidance in decomposing learning objectives. Based on learner feedback, this successful experience allows for consolidation and integration of convergence research via problem-based learning into the curriculum. Three bioengineering participants, who received training in data science and engineering, have since obtained bioinformatics jobs in the biotechnology industry.

2.
Front Microbiol ; 14: 1086021, 2023.
Article in English | MEDLINE | ID: mdl-37125195

ABSTRACT

The growth and survival of an organism in a particular environment depend heavily on certain indispensable genes, termed essential genes. Sulfate-reducing bacteria (SRB) are obligate anaerobes that rely on sulfate reduction for their energy requirements. The present study used Oleidesulfovibrio alaskensis G20 (OA G20) as a model SRB to categorize the essential genes based on their key metabolic pathways. Herein, we report a feedback loop framework for gene-of-interest discovery, from bio-problem to gene set of interest, leveraging expert annotation with computational prediction. The defined bio-problem was applied to retrieve SRB genes from literature databases (PubMed and PubMed Central), and the genes were annotated to the genome of OA G20. The retrieved gene list was then used for protein-protein interaction enrichment and corroborated against the pangenome analysis, to categorize the enriched gene sets and their respective pathways as essential or non-essential. Interestingly, the sat gene (dde_2265) from sulfur metabolism was the bridging gene between all the enriched pathways. Gene clusters involved in essential pathways were linked with genes from seleno-compound metabolism, amino acid metabolism, secondary metabolite synthesis, and cofactor biosynthesis. Furthermore, pangenome analysis demonstrated the gene distribution, where 69.83% of the 116 enriched genes were mapped under "persistent," inferring the essentiality of these genes. Likewise, 21.55% of the enriched genes, notably the formate dehydrogenases and metallic hydrogenases, appeared under "shell." Our methodology suggests that semi-automated text mining and network analysis may play a crucial role in deciphering previously unexplored genes and key mechanisms, which can help to generate a baseline prior to performing any experimental studies.
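The percentages in this abstract can be turned back into approximate gene counts. The short sketch below does that arithmetic. The counts are inferred here by rounding and are not reported in the abstract, and the remainder category is left unnamed because the abstract does not name it.

```python
# Worked arithmetic for the pangenome partition shares quoted in the abstract.
# Assumption: gene counts are recovered by rounding; the abstract gives only percentages.
ENRICHED_GENES = 116  # total enriched genes, from the abstract
reported_share = {"persistent": 69.83, "shell": 21.55}  # percentages, from the abstract

# Convert each reported percentage into the nearest whole number of genes.
counts = {label: round(pct / 100 * ENRICHED_GENES) for label, pct in reported_share.items()}

# Whatever is left falls outside the two partitions named in the abstract.
remaining = ENRICHED_GENES - sum(counts.values())

for label, n in counts.items():
    print(f"{label}: {n} genes ({n / ENRICHED_GENES:.2%})")
print(f"other partitions: {remaining} genes ({remaining / ENRICHED_GENES:.2%})")
```

Recomputing each share from the rounded count is a quick consistency check: if the recomputed percentage matches the reported one to two decimals, the inferred count is plausible.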

3.
Microorganisms ; 11(1)2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36677411

ABSTRACT

A significant amount of literature is available on biocorrosion, which makes manual extraction of crucial information such as genes and proteins a laborious task. Despite the fast growth of biology-related corrosion studies, there is a limited number of gene collections relating to the corrosion process (biocorrosion). Text mining offers a potential solution by automatically extracting the essential information from unstructured text. We present a text mining workflow that extracts biocorrosion-associated genes/proteins in sulfate-reducing bacteria (SRB) from literature databases (e.g., PubMed and PMC). This semi-automatic workflow is built with the Named Entity Recognition (NER) method and a Convolutional Neural Network (CNN) model. With PubMed and PMCID as inputs, the workflow identified 227 genes belonging to several Desulfovibrio species. To validate their functions, Gene Ontology (GO) enrichment and biological network analyses were performed using UniProtKB and STRING-DB, respectively. The GO analysis showed that metal ion binding, sulfur binding, and electron transport were among the principal molecular functions. Furthermore, the biological network analysis generated three interlinked clusters containing genes involved in metal ion binding, cellular respiration, and electron transfer, which suggests the involvement of the extracted gene set in biocorrosion. Finally, the dataset was validated through manual curation, yielding a set of genes similar to that from our workflow; among these, hysB and hydA, and sat and dsrB were identified as the metal ion binding and sulfur metabolism genes, respectively. The identified genes were mapped to the pangenome of 63 SRB genomes, which yielded the distribution of these genes across the 63 SRB based on amino acid sequence similarity; the genes were further categorized into core and accessory gene families.
SRB's role in biocorrosion involves the transfer of electrons from the metal surface via a hydrogen medium to the sulfate reduction pathway. Therefore, genes encoding hydrogenases and cytochromes might be participating in removing hydrogen from the metals through electron transfer. Moreover, the production of corrosive sulfide from the sulfur metabolism indirectly contributes to the localized pitting of the metals. After the corroboration of text mining results with SRB biocorrosion mechanisms, we suggest that the text mining framework could be utilized for genes/proteins extraction and significantly reduce the manual curation time.
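To make the idea of pulling gene mentions out of unstructured text concrete, here is a minimal sketch. It is not the NER/CNN workflow described in the abstract: it swaps in a plain dictionary lookup, and the lexicon, function name, and sample sentence are all hypothetical (only the four gene symbols come from the abstract).

```python
import re
from collections import Counter

# Hypothetical lexicon seeded with the four gene symbols named in the abstract.
GENE_LEXICON = {"hysB", "hydA", "sat", "dsrB"}

def find_gene_mentions(text, lexicon=GENE_LEXICON):
    """Count case-sensitive, whole-token matches of lexicon entries in text.

    This is a dictionary lookup, a much simpler stand-in for trained NER.
    """
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9_]*", text)
    return Counter(tok for tok in tokens if tok in lexicon)

# Invented sample sentence, used only to exercise the function.
sample = (
    "The hydA and hysB hydrogenases act upstream of sat; "
    "dsrB and sat drive sulfate reduction."
)
mentions = find_gene_mentions(sample)
print(mentions.most_common())
```

A lookup like this cannot tell the gene symbol sat from the English word "sat", which is one reason a context-aware NER model is attractive for this kind of task.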

4.
J Mol Biol ; 435(2): 167895, 2023 01 30.
Article in English | MEDLINE | ID: mdl-36463932

ABSTRACT

Micrograph comparison remains useful in bioscience. This technology provides researchers with a quick snapshot of experimental conditions. But sometimes a two-condition comparison relies on researchers' eyes to draw conclusions. Our Bioimage Analysis, Statistic, and Comparison (BASIN) software provides an objective and reproducible comparison leveraging inferential statistics to bridge image data with other modalities. Users have access to machine learning-based object segmentation. BASIN provides several data points such as images' object counts, intensities, and areas. Hypothesis testing may also be performed. To improve BASIN's accessibility, we implemented it using R Shiny and provided both an online and offline version. We used BASIN to process 498 image pairs involving five bioscience topics. Our framework supported either direct claims or extrapolations 57% of the time. Analysis results were manually curated to determine BASIN's accuracy which was shown to be 78%. Additionally, each BASIN version's initial release shows an average 82% FAIR compliance score.
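As an illustration of how a two-condition comparison of per-image object counts can be made objective, the sketch below runs an exact two-sided permutation test on the difference in means. The abstract does not say which tests BASIN uses, so the choice of test, the function name, and the object counts are all assumptions made for this example.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(sum(group_b) / len(group_b) - sum(group_a) / n_a)

    extreme = 0
    total = 0
    # Enumerate every way to relabel n_a of the pooled values as condition A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(b) / len(b) - sum(a) / len(a))
        if diff >= observed - 1e-12:  # small tolerance for float comparison
            extreme += 1
        total += 1
    return extreme / total

# Invented object counts per image for two experimental conditions.
control_counts = [12, 15, 11, 14]
treated_counts = [20, 22, 19, 24]

p_value = permutation_p_value(control_counts, treated_counts)
print(f"two-sided permutation p-value: {p_value:.4f}")
```

With four images per condition there are only 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70. Small image sets limit how strong a statistical claim can be, regardless of how clear the difference looks by eye.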


Subjects
Biofilms ; Biological Science Disciplines ; Image Processing, Computer-Assisted ; Machine Learning ; Software ; Image Processing, Computer-Assisted/methods ; Workflow ; Datasets as Topic ; Biological Science Disciplines/methods
7.
Nucleic Acids Res ; 45(D1): D1117-D1122, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924016

ABSTRACT

Bioinformatics and computational biology play a critical role in bioscience and biomedical research. As researchers design their experimental projects, one major challenge is to find the most relevant bioinformatics toolkits that will lead to new knowledge discovery from their data. The Bio-TDS (Bioscience Query Tool Discovery Systems, http://biotds.org/) has been developed to assist researchers in retrieving the most applicable analytic tools by allowing them to formulate their questions as free text. The Bio-TDS is a flexible retrieval system that affords users from multiple bioscience domains (e.g. genomic, proteomic, bio-imaging) the ability to query over 12 000 analytic tool descriptions integrated from well-established community repositories. One of the primary components of the Bio-TDS is the ontology and natural language processing workflow for annotation, curation, query processing, and evaluation. The Bio-TDS's scientific impact was evaluated using sample questions posed by researchers retrieved from Biostars, a site focusing on biological data analysis. The Bio-TDS was compared to five similar bioscience analytic tool retrieval systems, with the Bio-TDS outperforming the others in terms of relevance and completeness. The Bio-TDS offers researchers the capacity to associate their bioscience question with the most relevant computational toolsets required for the data analysis in their knowledge discovery process.


Subjects
Computational Biology/methods ; Databases, Factual ; Software ; Database Management Systems ; Genomics/methods ; Molecular Sequence Annotation ; Proteomics/methods ; Reproducibility of Results ; Web Browser ; Workflow
8.
Nucleic Acids Res ; 39(Web Server issue): W528-32, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21546552

ABSTRACT

The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet.
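The record-then-replay idea described in this abstract can be sketched in a few lines. This is not the BioExtract Server's implementation or API: the class, its methods, and the two toy steps are hypothetical, chosen only to show how recorded tasks can be re-executed with the original or modified parameter settings.

```python
from dataclasses import dataclass, field

@dataclass
class RecordedWorkflow:
    """Hypothetical container for an ordered list of recorded steps."""
    name: str
    description: str = ""
    steps: list = field(default_factory=list)  # each entry: (function, params dict)

    def record(self, func, **params):
        """Append one step with the parameters it was run with."""
        self.steps.append((func, params))
        return self

    def run(self, data, overrides=None):
        """Replay all steps in order; overrides maps a step's function name to new params."""
        overrides = overrides or {}
        for func, params in self.steps:
            merged = {**params, **overrides.get(func.__name__, {})}
            data = func(data, **merged)
        return data

# Two toy steps standing in for real queries and analytic tools.
def filter_by_length(seqs, min_len):
    return [s for s in seqs if len(s) >= min_len]

def to_upper(seqs):
    return [s.upper() for s in seqs]

workflow = (
    RecordedWorkflow("demo", "filter short sequences, then normalize case")
    .record(filter_by_length, min_len=4)
    .record(to_upper)
)

inputs = ["acgt", "ac", "ttgaca"]
first_run = workflow.run(inputs)
rerun = workflow.run(inputs, overrides={"filter_by_length": {"min_len": 5}})
print(first_run, rerun)
```

Keeping the parameters alongside each step, rather than baking them into the step itself, is what makes the saved workflow both reproducible and adjustable on a later run.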


Subjects
Genomics/methods ; Software ; Databases, Genetic ; Internet ; Systems Integration ; Workflow
9.
J Mol Neurosci ; 44(1): 53-8, 2011 May.
Article in English | MEDLINE | ID: mdl-21416271

ABSTRACT

Autosomal recessive spastic ataxia of Charlevoix-Saguenay is a distinct form of hereditary early-onset spastic ataxia caused by cerebellum and spinal cord degeneration. The SACS gene has been demonstrated to be responsible for the disease through worldwide description of different mutations. We report here a computational analysis of a novel SACS gene mutation identified in a Tunisian family, using a workflow implemented on the BioExtract Server. Several online computational tools are currently available to explore the effect of newly identified mutations in humans and other organisms. Such analysis is time-consuming and generates a batch of files that researchers need to extract and save. The BioExtract Server workflow described here offers an easy way to execute the required tools together, avoiding the need to enter queries independently in each web tool or service.


Subjects
Computational Biology/methods ; DNA Mutational Analysis/methods ; Heat-Shock Proteins/genetics ; Mutation ; Computer Simulation ; Computer Systems ; DNA Mutational Analysis/instrumentation ; Humans ; Muscle Spasticity/genetics ; Pedigree ; Phenotype ; Spinocerebellar Ataxias/congenital ; Spinocerebellar Ataxias/genetics ; Tunisia
10.
Int J Plant Genomics ; 2011: 923035, 2011.
Article in English | MEDLINE | ID: mdl-22253616

ABSTRACT

The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time, sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein.

11.
Article in English | MEDLINE | ID: mdl-20150665

ABSTRACT

Many in silico investigations in bioinformatics require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community databases, and project databases for use in domain-specific research. Different data sources frequently utilize distinct query languages and return results in unique formats, and therefore researchers must either rely upon a small number of primary data sources or become familiar with multiple query languages and formats. Similarly, the associated analytic tools often require specific input formats and produce unique outputs which make it difficult to utilize the output from one tool as input to another. The BioExtract Server (http://bioextract.org) is a Web-based data integration application designed to consolidate, analyze, and serve data from heterogeneous biomolecular databases in the form of a mash-up. The basic operations of the BioExtract Server allow researchers, via their Web browsers, to specify data sources, flexibly query data sources, apply analytic tools, download result sets, and store query results for later reuse. As a researcher works with the system, their "steps" are saved in the background. At any time, these steps can be preserved long-term as a workflow simply by providing a workflow name and description.


Subjects
Biopolymers/chemistry ; Data Mining/methods ; Database Management Systems ; Databases, Factual ; Information Dissemination/methods ; Internet ; Software ; Biopolymers/classification ; Biopolymers/physiology ; Computational Biology/methods ; Workflow
12.
Nucleic Acids Res ; 36(Database issue): D959-65, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18063570

ABSTRACT

PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial artificial chromosome (BAC)-based genome assemblies. Annotation is facilitated by the integrated yrGATE module for community curation of gene models. Novel web services at PlantGDB include Tracembler, an iterative alignment tool that generates contigs from GenBank trace file data, and BioExtract Server, a web-based server for executing custom sequence analysis workflows. PlantGDB also hosts a plant genomics research outreach portal (PGROP) that facilitates access to a large number of resources for research and training.


Subjects
Databases, Genetic ; Genome, Plant ; Genes, Plant ; Genomics ; Internet ; Plant Proteins/chemistry ; Plant Proteins/genetics ; RNA, Messenger/chemistry ; Sequence Alignment ; Software ; User-Computer Interface
13.
Int J Comput Biol Drug Des ; 1(3): 302-12, 2008.
Article in English | MEDLINE | ID: mdl-20054995

ABSTRACT

Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed service designed to give researchers web-based means to query multiple data sources, save results as searchable data sets, and execute analytic tools. As the researcher works with the system, their tasks are saved in the background. At any time these tasks can be saved as a workflow that can then be executed again and/or modified later.


Subjects
Computational Biology/methods ; Computer Communication Networks ; Computational Biology/statistics & numerical data ; Computer Systems ; Databases, Genetic ; Internet ; Software Design
14.
Plant Physiol ; 139(2): 610-8, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16219921

ABSTRACT

PlantGDB (http://www.plantgdb.org/) is a database of plant molecular sequences. Expressed sequence tag (EST) sequences are assembled into contigs that represent tentative unique genes. EST contigs are functionally annotated with information derived from known protein sequences that are highly similar to the putative translation products. Tentative Gene Ontology terms are assigned to match those of the similar sequences identified. Genome survey sequences are assembled similarly. The resulting genome survey sequence contigs are matched to ESTs and conserved protein homologs to identify putative full-length open reading frame-containing genes, which are subsequently provisionally classified according to established gene family designations. For Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa), the exon-intron boundaries for gene structures are annotated by spliced alignment of ESTs and full-length cDNAs to their respective complete genome sequences. Unique genome browsers have been developed to present all available EST and cDNA evidence for current transcript models (for Arabidopsis, see the AtGDB site at http://www.plantgdb.org/AtGDB/; for rice, see the OsGDB site at http://www.plantgdb.org/OsGDB/). In addition, a number of bioinformatic tools have been integrated at PlantGDB that enable researchers to carry out sequence analyses on-site using both their own data and data residing within the database.


Subjects
Databases, Genetic ; Genome, Plant ; Plants/genetics ; Computational Biology ; DNA, Plant/genetics ; Expressed Sequence Tags ; Genomics ; Repetitive Sequences, Nucleic Acid ; Software